conda list

Create spark session

Load images from local storage
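The loading step could be sketched as follows. This is a minimal sketch, assuming Spark 2.4 or later (which ships the built-in `image` data source) and an existing `SparkSession`. The helper name `load_images` and the example path are hypothetical and do not come from the original notebook.

```python
def load_images(spark, path):
    """Read every image under `path` into a DataFrame with an `image` struct column.

    `spark` is assumed to be an existing SparkSession; `path` is a local or
    distributed directory (hypothetical example: "data/train").
    """
    return (
        spark.read.format("image")     # built-in image data source (Spark >= 2.4)
        .option("dropInvalid", True)   # skip files that cannot be decoded as images
        .load(path)
    )
```

With this reader, the file location of each row is available as `image.origin`, which is what a later label-extraction step can work from.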

# Total number of images
totalNumber = images.count()
print('Total number of images in train set {}'.format(totalNumber))

Retrieve labels from image path
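One way to derive the label is sketched below. It assumes each class has its own folder, so the label is the name of the directory that directly contains the image. The function name `label_from_path` and the example paths are hypothetical, and the notebook may use a different layout.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote, urlparse


def label_from_path(path: str) -> str:
    """Return the name of the directory that directly contains the image.

    Assumes a layout such as .../<label>/<file>.jpg (an assumption, not
    confirmed by the notebook). Accepts plain POSIX paths and file: URIs.
    """
    # Spark reports origins as URIs, e.g. file:/data/train/apple/img_001.jpg
    parsed = urlparse(path)
    # Decode percent-escapes such as %20 before splitting the path
    posix_path = PurePosixPath(unquote(parsed.path))
    return posix_path.parent.name


print(label_from_path("file:/data/train/apple/img_001.jpg"))  # apple
```

In Spark, the same function could be wrapped in a UDF and applied to the `image.origin` column to add a label column to the DataFrame.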

Image enhancement

Color balance

Sharpness

Contrast

Brightness
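The four adjustments above map onto the enhancers in Pillow's `ImageEnhance` module. The sketch below assumes Pillow is the library in use, which the notebook does not confirm. The helper name `enhance_image` and its default factors are hypothetical. A factor of 1.0 leaves the image unchanged; values below or above 1.0 weaken or strengthen the effect.

```python
from PIL import Image, ImageEnhance


def enhance_image(img, color=1.0, sharpness=1.0, contrast=1.0, brightness=1.0):
    """Apply color balance, sharpness, contrast and brightness in sequence.

    Each factor follows Pillow's convention: 1.0 returns the original image.
    """
    img = ImageEnhance.Color(img).enhance(color)            # color balance (saturation)
    img = ImageEnhance.Sharpness(img).enhance(sharpness)    # < 1.0 blurs, > 1.0 sharpens
    img = ImageEnhance.Contrast(img).enhance(contrast)      # 0.0 gives a solid gray image
    img = ImageEnhance.Brightness(img).enhance(brightness)  # 0.0 gives a black image
    return img


# Synthetic example image, so the sketch does not depend on any file on disk
sample = Image.new("RGB", (8, 8), (100, 150, 200))
unchanged = enhance_image(sample)
black = enhance_image(sample, brightness=0.0)
```

The factors would normally be tuned per dataset; the defaults here only illustrate the call pattern.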

Transfer learning (ResNet50)

Apply featurization to the DataFrame of images

images.show()

Saving

End Spark session